roachtest: beef up jobs/stress roachtest #158304
Potential Bug(s) Detected
The three-stage Claude Code analysis has identified potential bug(s) in this PR that may warrant investigation.
Next Steps:
Note: When viewing the workflow output, scroll to the bottom to find the Final Analysis Summary.
After you review the findings, please tag the issue as follows:
This test sufficiently stresses the job system: the job adoption rate is significantly decreased (but not flatlined). We also see plenty of logs indicating claim query timeouts (added in #160084), for example:
This patch reconfigures the jobs/stress roachtest logic in a few ways:
- now runs on a 20 node cluster in the weekly suite
- lowers the base interval to 0.1, to simulate internal job query contention at a 200 node scale
- refines the job control loop logic to pause 20% of running changefeeds, resume 20% of paused changefeeds, and recreate up to 200 canceled changefeeds per iteration
- fails the roachtest if a job stays unclaimed for more than 5 minutes

Informs: cockroachdb#158976
Release note: none
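For readers skimming the thread, the per-iteration control loop numbers from the commit message can be sketched roughly as follows. This is only an illustration and not the code in this PR: `jobCounts`, `iterationPlan`, `planIteration`, `exceedsUnclaimedLimit`, and the constant names are all hypothetical, and the real test presumably drives these actions through SQL against the cluster.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical constants mirroring the numbers in the commit message.
const (
	pausePercent   = 20              // pause 20% of running changefeeds
	resumePercent  = 20              // resume 20% of paused changefeeds
	maxRecreate    = 200             // recreate at most 200 canceled changefeeds
	unclaimedLimit = 5 * time.Minute // fail if a job stays unclaimed longer than this
)

// jobCounts is a hypothetical snapshot of changefeed job states.
type jobCounts struct {
	running, paused, canceled int
}

// iterationPlan records how many jobs one control loop iteration acts on.
type iterationPlan struct {
	toPause, toResume, toRecreate int
}

// planIteration applies the percentages and the recreate cap to a snapshot.
// Integer arithmetic is used, so fractional jobs are rounded down.
func planIteration(c jobCounts) iterationPlan {
	recreate := c.canceled
	if recreate > maxRecreate {
		recreate = maxRecreate
	}
	return iterationPlan{
		toPause:    c.running * pausePercent / 100,
		toResume:   c.paused * resumePercent / 100,
		toRecreate: recreate,
	}
}

// exceedsUnclaimedLimit reports whether a job has been unclaimed for
// strictly more than the limit, which would fail the test.
func exceedsUnclaimedLimit(unclaimedFor time.Duration) bool {
	return unclaimedFor > unclaimedLimit
}

func main() {
	plan := planIteration(jobCounts{running: 1000, paused: 50, canceled: 350})
	fmt.Printf("%+v\n", plan)
	fmt.Println(exceedsUnclaimedLimit(6 * time.Minute))
}
```

The recreate step is a cap rather than a percentage, so a large backlog of canceled changefeeds is refilled by at most 200 jobs per iteration.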
TFTR!
bors r=dt

Build succeeded:
